In this analysis different portfolios consisting of stocks that are among the top 10 constituents of the MSCI Spain Index are backtested using the functionality of the portvine package. First the most important packages are loaded and after that the data will be imported and discussed shortly.

library(portvine)
library(tidyverse)
# utility color vector for visualizations
yes_no_cols <- c("#92B8DE", "#db4f59")
theme_set(
  theme_minimal() +
  theme(plot.title = ggtext::element_markdown(size = 11),
        plot.subtitle = ggtext::element_markdown(size = 9))
)
# load the data
load(here::here("data", "msci_spain_data_clean.RData"))

A first glimpse at the data

glimpse(msci_spain_data)
## Rows: 3,089
## Columns: 11
## $ date             <dttm> 2010-01-05, 2010-01-06, 2010-01-07, 2010-01-08, 2010~
## $ msci_spain_index <dbl> 7.227127e-03, -3.519919e-03, -3.271415e-03, -7.783202~
## $ iberdrola        <dbl> 0.0011983863, -0.0013316290, -0.0064166196, -0.001046~
## $ banco_santander  <dbl> 0.010994558, 0.007111767, -0.004592700, 0.003339731, ~
## $ inditex          <dbl> -0.0066202730, 0.0004579803, -0.0147604156, 0.0034790~
## $ cellnex_telecom  <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ repsol_ypf       <dbl> 0.0029173273, -0.0045062036, 0.0005267713, -0.0010632~
## $ ferrovial        <dbl> 0.013137950, 0.032525280, -0.005260778, 0.003207450, ~
## $ amadeus_it_group <dbl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N~
## $ telefonica       <dbl> -0.0015128243, -0.0073567397, -0.0099731076, -0.01763~
## $ bbv_argentaria   <dbl> 0.0069158977, 0.0038205007, -0.0045843707, 0.00763226~

One can see that there are 11 columns. The date column gives obviously the date as daily return data will be analyzed here. The daily log returns of the overall index are given in the column msci_spain_index and all other columns contain the daily log returns of the respective stock. One can also detect missing values which can be nicely observed in the visualization below.

This makes total sense as for example Cellnex Telecom stocks are traded since 2015, so there can not be earlier return data. So one could now either impute the values in order to use the whole time frame but here the more simplistic approach of just looking at a time frame without any missing values is applied. This reduced data set is available as msci_spain_complete_data and contains only complete cases.

summary(msci_spain_complete_data$date)
##                  Min.               1st Qu.                Median 
## "2015-05-08 00:00:00" "2016-12-21 18:00:00" "2018-08-07 12:00:00" 
##                  Mean               3rd Qu.                  Max. 
## "2018-08-07 16:52:04" "2020-03-24 06:00:00" "2021-11-08 00:00:00"
nrow(msci_spain_complete_data)
## [1] 1696

This means that the time frame is from the 8th Mai of 2015 until the 8th November of 2021 and there are 1696 observations. Below one can have a look at daily log returns of the overall MSCI Spain index.

So one can observe greater volatility during the stock market selloffs 2015-2016 maybe due to Chinese stock market turbulences, the EU dept crisis and the Brexit votum as well as for the first ‘Covid-19 year’ 2020. This should give the opportunity to access how robust the risk measure estimates are during a crisis. Now specify the first portfolio and analyze it unconditionally as well as conditionally.

The missing leader portfolio missl

Here one picks as the stock portfolio equally weighted all the stocks without the one with the highest market capitalization that makes roughly 17% of the MSCI Spain i.e. Iberdrola, that is mainly in the energy business. First one models this portfolio and estimates the risk estimates unconditionally and then conditionally on the biggest player Iberdrola.

Unconditional missl portfolio

First specify suitable marginal model settings as well as settings for the vine copula.

missl_marginal_settings <- marginal_settings(
  train_size = 1000,
  refit_size = 50
)

missl_vine_settings <- vine_settings(
  train_size = 200,
  refit_size = 25,
  family_set = "all",
  vine_type = "rvine"
)

To compare run times one estimates the risk once sequentially and once in parallel.

# sequentially
missl_risk_roll_seq <- estimate_risk_roll(
  msci_spain_complete_data %>% select(-c(date, msci_spain_index, iberdrola)),
  weights = NULL,
  marginal_settings = missl_marginal_settings,
  vine_settings = missl_vine_settings,
  alpha = c(0.01, 0.05, 0.1),
  risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
  n_samples = 1000,
  n_mc_samples = 1000,
  trace = TRUE
)
## The last window of interest is shorter (width:  46 ) than the specified window width of 50
## 
## Fit marginal models:
## banco_santander  inditex  cellnex_telecom  repsol_ypf  ferrovial  amadeus_it_group  telefonica  bbv_argentaria  
## 
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
# 4 parallel R session
future::plan("multisession", workers = 4)
missl_risk_roll <- estimate_risk_roll(
  msci_spain_complete_data %>% select(-c(date, msci_spain_index, iberdrola)),
  weights = NULL,
  marginal_settings = missl_marginal_settings,
  vine_settings = missl_vine_settings,
  alpha = c(0.01, 0.05, 0.1),
  risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
  n_samples = 1000,
  n_mc_samples = 1000,
  trace = TRUE
)
## The last window of interest is shorter (width:  46 ) than the specified window width of 50
## 
## Fit marginal models:
## banco_santander  inditex  cellnex_telecom  repsol_ypf  ferrovial  amadeus_it_group  telefonica  bbv_argentaria  
## 
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
future::plan("sequential")

The results and their respective run times can be seen below.

missl_risk_roll_seq
## An object of class <portvine_roll>
## Number of ARMA-GARCH/ marginal windows: 14 
## Number of vine windows: 28 
## Risk measures estimated: VaR ES_mean ES_median ES_mc 
## Alpha levels used: 0.01 0.05 0.1 
## 
## Time taken: 6.4683 minutes
missl_risk_roll
## An object of class <portvine_roll>
## Number of ARMA-GARCH/ marginal windows: 14 
## Number of vine windows: 28 
## Risk measures estimated: VaR ES_mean ES_median ES_mc 
## Alpha levels used: 0.01 0.05 0.1 
## 
## Time taken: 2.1712 minutes
# thus the speedup is already for 4 workers
missl_risk_roll_seq@time_taken / missl_risk_roll@time_taken
## [1] 2.979128
# besides the print method one can also call the summary for more details
summary(missl_risk_roll)
## An object of class <portvine_roll>
## 
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 14 
## Train size:  1000 
## Refit size:  50 
## 
## --- Vine copula models ---
## Number of vine windows: 28 
## Train size:  200 
## Refit size:  25 
## Vine copula type:  rvine 
## Vine family set:  all 
## 
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_median ES_mc 
## Alpha levels used: 0.01 0.05 0.1 
## Number of estimated risk measures: 8352 
## Number of samples for each risk estimation: 1000 
## 
## Time taken: 2.1712 minutes.

Inspect marginal and vine copula model fits

Now one can use the accessor functions to extract the fitted marginal as well as the vine copula models.

### get the fitted marginal models
missl_fitted_marginals <- fitted_marginals(missl_risk_roll)
# for each asset one now can access the uGARCHroll object that for example 
# contains the fitted coefficients for the second rolling window 
missl_fitted_marginals$banco_santander@model$coef[[2]]$coef
##             Estimate   Std. Error    t value   Pr(>|t|)
## mu     -3.958998e-04 5.326216e-04 -0.7433042 0.45729755
## ar1     4.061090e-01 2.231842e-01  1.8196139 0.06881782
## ma1    -3.838058e-01 2.235111e-01 -1.7171666 0.08594874
## omega   6.565105e-06 2.742517e-05  0.2393824 0.81080906
## alpha1  5.782711e-02 5.768279e-02  1.0025020 0.31610118
## beta1   9.225192e-01 4.234043e-02 21.7881395 0.00000000
## skew    9.708474e-01 3.740128e-02 25.9575985 0.00000000
## shape   6.337415e+00 3.180092e+00  1.9928400 0.04627897
### get the fitted vine copula models
missl_fitted_vines <- fitted_vines(missl_risk_roll)
# then one can for example have a look at how the fitted vine structure evolved
#  over time by plotting the first tree for different rolloing windows
rvinecopulib:::plot.vinecop(missl_fitted_vines[[1]])

rvinecopulib:::plot.vinecop(missl_fitted_vines[[28]])

Analyze estimated risk measures

Now finally the risk estimates are going to be analyzed. First one compares the 3 different methods of estimating the Expected Shortfall.

Also one might compare the number of exceedances.

risk_estimates(missl_risk_roll,
               risk_measures = c("ES_mean", "ES_median", "ES_mc"),
               alpha = 0.05,
               exceeded = TRUE) %>%
  group_by(risk_measure) %>%
  summarise(relative_exceedances = mean(exceeded))

Here actually we find an indication for the fact that the values that fell below the corresponding VaR where left skewed which leads to the fact that the mean will be a more conservative estimate in those cases which might explain the slightly lower exceedance rate.

Exceedance plots

risk_estimates(missl_risk_roll,
               risk_measures = c("ES_mean", "VaR"),
               alpha = 0.01) %>%
  ggplot() +
  geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
  geom_line(aes(x = row_num, y = risk_est,
                col = risk_measure, linetype = risk_measure),
            size = 0.5) +
  labs(x = "estimation window", y = "portfolio log returns",
       linetype = "Risk measure",
       title = "Comparison of risk measure behaivior for alpha level 0.01") +
  scale_color_manual(values = c(yes_no_cols[1], "#477042"),
                     name = "Risk measure") +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank())

risk_estimates(missl_risk_roll,
               risk_measures = c("VaR"),
               alpha = 0.01,
               exceeded = TRUE) %>%
  ggplot() +
  geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
  geom_point(aes(x = row_num, y = realized, col = exceeded)) +
  geom_line(aes(x = row_num, y = risk_est), col = yes_no_cols[1]) +
  scale_color_manual(values = c("lightgrey", yes_no_cols[2])) +
  labs(x = "estimation window",
       y = "portfolio log returns",
       col = "Exceeded",
       subtitle = paste0("Exceedances are highlighted in ",
                     "<span style='color:",
                     yes_no_cols[2],
                     "'>**red**</span>",
                     "."),
       title = "Risk measure: VaR with alpha level 0.01"
       ) +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "none")

risk_estimates(missl_risk_roll,
               risk_measures = c("ES_mean"),
               alpha = 0.01,
               exceeded = TRUE) %>%
  ggplot() +
  geom_line(aes(x = row_num, y = realized), col = "lightgrey") +
  geom_point(aes(x = row_num, y = realized, col = exceeded)) +
  geom_line(aes(x = row_num, y = risk_est), col = yes_no_cols[1]) +
  scale_color_manual(values = c("lightgrey", yes_no_cols[2])) +
  labs(x = "estimation window",
       y = "portfolio log returns",
       col = "Exceeded",
       subtitle = paste0("Exceedances are highlighted in ",
                     "<span style='color:",
                     yes_no_cols[2],
                     "'>**red**</span>",
                     "."),
       title = "Risk measure: ES (mean estimation) with alpha level 0.01"
       ) +
  theme(
    axis.text.x = element_blank(),
    axis.ticks.x = element_blank(),
    legend.position = "none")

Conditional missl portfolio

Estimation is performed in the same fashion as above with some additional arguments concerning the conditioning on the stock Iberdrola.

missl_cond_marginal_settings <- marginal_settings(
  train_size = 1000,
  refit_size = 50
)

missl_cond_vine_settings <- vine_settings(
  train_size = 200,
  refit_size = 25,
  family_set = "parametric",
  vine_type = "dvine"
)
future::plan("multisession", workers = 6)
missl_cond_risk_roll <- estimate_risk_roll(
  msci_spain_complete_data %>% select(-c(date, msci_spain_index)),
  weights = NULL,
  marginal_settings = missl_cond_marginal_settings,
  vine_settings = missl_cond_vine_settings,
  alpha = c(0.01, 0.05, 0.1),
  risk_measures = c("VaR", "ES_mean", "ES_median", "ES_mc"),
  n_samples = 1000,
  n_mc_samples = 1000,
  cond_vars = "iberdrola",
  cond_alpha = c(0.05, 0.5),
  trace = TRUE
)
## The last window of interest is shorter (width:  46 ) than the specified window width of 50
## 
## Fit marginal models:
## iberdrola  banco_santander  inditex  cellnex_telecom  repsol_ypf  ferrovial  amadeus_it_group  telefonica  bbv_argentaria  
## 
## Fit vine copula models and estimate risk.
## Vine windows:
## (1/28) (2/28) (3/28) (4/28) (5/28) (6/28) (7/28) (8/28) (9/28) (10/28) (11/28) (12/28) (13/28) (14/28) (15/28) (16/28) (17/28) (18/28) (19/28) (20/28) (21/28) (22/28) (23/28) (24/28) (25/28) (26/28) (27/28) (28/28)
future::plan("sequential")
summary(missl_cond_risk_roll)
## An object of class <cond_portvine_roll>
## 
## --- Conditional settings ---
## Conditional variable(s): iberdrola 
## Number of conditional estimated risk measures: 16704 
## Conditioning quantiles: 0.05 0.5 
## 
## --- Marginal models ---
## Number of ARMA-GARCH/ marginal windows: 14 
## Train size:  1000 
## Refit size:  50 
## 
## --- Vine copula models ---
## Number of vine windows: 28 
## Train size:  200 
## Refit size:  25 
## Vine copula type:  dvine 
## Vine family set:  parametric 
## 
## --- Risk estimation ---
## Risk measures estimated: VaR ES_mean ES_median ES_mc 
## Alpha levels used: 0.01 0.05 0.1 
## Number of estimated risk measures: 8352 
## Number of samples for each risk estimation: 1000 
## 
## Time taken: 9.0003 minutes.